
Conversation

shino16 (Collaborator) commented Sep 29, 2025

Fixes #2539, which in turn fixes #2527 and #2501. This makes PR #2538 obsolete.

As noted in #2527 (comment), when we pass a GraphModule to torch.compile it does not get lowered properly. As a workaround, this PR wraps the fallback GraphModule in a new nn.Module before compiling it. The workaround was suggested in #2527 (comment) (credit @mattteochen).
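
A minimal sketch of the idea (not the exact code in this PR; the class name and fallback_gm are illustrative):

import torch

class WrappedFallback(torch.nn.Module):  # hypothetical name, for illustration only
    def __init__(self, gm: torch.fx.GraphModule):
        super().__init__()
        self.gm = gm

    def forward(self, *args, **kwargs):
        return self.gm(*args, **kwargs)

# Passing the wrapper instead of the GraphModule itself lets torch.compile re-trace
# the forward and hand the resulting graph to Inductor for lowering.
# compiled = torch.compile(WrappedFallback(fallback_gm))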

Alternatively, we could call torch._inductor.compile_fx.compile_fx directly to compile the GraphModule, but since it returns a bare forward function rather than an nn.Module instance, the result cannot be registered as a submodule of the outer GraphModule.
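
For reference, a hedged sketch of that alternative; fallback_gm and example_inputs stand in for the real objects, and the call is shown commented out:

from torch._inductor.compile_fx import compile_fx

# compile_fx lowers the GraphModule through Inductor directly, but it returns a plain
# compiled callable rather than an nn.Module, so the result cannot serve as the target
# of a call_module node in the outer GraphModule.
# compiled_fn = compile_fx(fallback_gm, example_inputs)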

@shino16 force-pushed the wrap-inductor-submodule branch from f3311ea to ec67c18 on September 29, 2025 15:23
shino16 (Collaborator, Author) commented Sep 29, 2025

The test failures are coming from this:

import torch

class GraphModule(torch.nn.Module):
    def forward(self, y: "f32[2, 2]"):
        # No stacktrace found for following nodes
        _enter_autocast = torch.amp.autocast_mode._enter_autocast('cpu', None, True, None)

         # File: /opt/pytorch/lightning-thunder/thunder/tests/test_dynamo.py:280 in func, code: y = torch.sinc(y)
        y_1: "f32[2, 2]" = torch.sinc(y);  y = None

        # No stacktrace found for following nodes
        _exit_autocast = torch.amp.autocast_mode._exit_autocast(_enter_autocast);  _enter_autocast = _exit_autocast = None
        return y_1

model = GraphModule()
y = torch.randn(2, 2)
torch.compile(model)(y)
  File "/usr/local/lib/python3.12/dist-packages/torch/_dynamo/convert_frame.py", line 338, in _fn
    assert guards.check(), (
           ^^^^^^^^^^^^^^
AssertionError: Global autocast state changed while dynamo tracing, please report a bug

@shino16 marked this pull request as draft on September 30, 2025 10:39
shino16 (Collaborator, Author) commented Sep 30, 2025

PyTorch checks that the global autocast state stays unchanged while Dynamo is tracing (ref). The torch.autocast() context manager satisfies this check by restoring the state in its cleanup (ref), but torch.amp.autocast_mode._enter_autocast has no such mechanism, so the global state is left changed.
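
For comparison, a minimal sketch (same shapes as the repro above) that uses the context manager instead of the raw _enter_autocast op; per the above, the state is restored on exit, so the guard check should pass:

import torch

class WithAutocastCM(torch.nn.Module):
    def forward(self, y):
        # The context manager restores the global autocast state in its __exit__,
        # so Dynamo's guard on global state is not tripped.
        with torch.autocast('cpu'):
            return torch.sinc(y)

torch.compile(WithAutocastCM())(torch.randn(2, 2))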

It may be better to rely on compile_fx instead of wrapping the GraphModule and tracing it again.

Successfully merging this pull request may close these issues.

ThunderFX's fallback is not using Inductor compilation
Activation checkpoint not working inside Inductor-compiled submodules